Speech classification model based on improved Inception network
Qiuyu ZHANG, Yukun WANG
Journal of Computer Applications    2023, 43 (3): 909-915.   DOI: 10.11772/j.issn.1001-9081.2022010047

Traditional audio classification models rely on a complicated feature extraction process, and existing neural network models suffer from overfitting, low classification accuracy, and vanishing gradients. To address these problems, a speech classification model based on an improved Inception network was proposed. Firstly, to avoid vanishing gradients while increasing the depth of the network, the residual skip connection from Residual Network (ResNet) was added to the traditional Inception V2 model. Secondly, the convolution kernel sizes in the Inception module were optimized, and deep features of the Log-Mel spectrogram of the original speech were extracted using convolutions of different sizes, so that the model could learn through training which convolution was appropriate for the data. At the same time, the model was extended in both depth and width to increase classification accuracy. Finally, the trained network model was used to classify the speech data, and the classification result was obtained through the Softmax function. Experimental results on the Tsinghua University Chinese speech database THCHS-30 and the ambient sound dataset UrbanSound8K show that the improved Inception network model achieves classification accuracies of 92.76% and 93.34%, respectively. Compared with models such as Visual Geometry Group (VGG16), Inception V2 and GoogLeNet, the proposed model achieves the best classification accuracy, with a maximum increase of 27.30 percentage points. These results indicate that the proposed model has stronger feature fusion ability, yields more accurate classification results, and can alleviate problems such as overfitting and vanishing gradients.
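
The general idea of combining multi-size convolution branches with a residual skip connection, followed by Softmax, can be illustrated with a small dependency-free sketch. This is not the authors' model: it works on a 1-D toy signal instead of a Log-Mel spectrogram, and the function names, kernel values, and mixing weights (`conv1d_same`, `residual_inception_block`, `mix`) are hypothetical choices made only for illustration.

```python
import math


def conv1d_same(x, kernel):
    """Slide an odd-length kernel over x with zero padding (output length == input length)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def residual_inception_block(x, kernels, mix):
    """Toy Inception-style block with a ResNet-style skip connection.

    kernels: one odd-length kernel per parallel branch (different sizes).
    mix:     one weight per branch, standing in for a learned 1x1 convolution
             that merges the branch outputs back to the input shape.
    """
    # Parallel branches: the same input seen through different kernel sizes.
    branches = [relu(conv1d_same(x, k)) for k in kernels]
    # Merge branches position by position (assumed stand-in for concat + 1x1 conv).
    merged = [
        sum(w * branch[i] for w, branch in zip(mix, branches))
        for i in range(len(x))
    ]
    # Skip connection: add the unchanged input, so the signal (and its gradient
    # in a real network) has a direct path around the branches.
    return relu([xi + mi for xi, mi in zip(x, merged)])


def softmax(logits):
    """Numerically stable Softmax: turns scores into class probabilities."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    kernels = [[1.0], [0.25, 0.5, 0.25], [0.1, 0.2, 0.4, 0.2, 0.1]]
    features = residual_inception_block(signal, kernels, mix=[0.5, 0.3, 0.2])
    probabilities = softmax(features[:3])  # pretend the first 3 values are class scores
    print(features, probabilities)
```

In a trained network the branch kernels and the merge weights would be learned, which is how such a model can favor whichever kernel size suits the data; setting every entry of `mix` to zero reduces the block to the skip path alone.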
